Automatic Speech Recognition for Tunisian Dialect

Authors

  • Ahmed Ben Ltaief
  • Yannick Estève
  • Marwa Graja
  • Lamia Hadrich Belguith
Abstract

Speech recognition for under-resourced languages has been an active field of research over the past decade. The Tunisian Arabic dialect was chosen as a typical example of an under-resourced Arabic dialect. In this paper, we present our first steps toward building an automatic speech recognition system for the Tunisian dialect. Several acoustic models were trained using HMM-GMM and HMM-DNN systems. The speech corpus was collected and transcribed from dialogues in the Tunisian Railway Transport Network. The HMM-DNN system gives an impressive relative reduction in word error rate (WER).

Similar resources

Mapping Rules for Building a Tunisian Dialect Lexicon and Generating Corpora

Nowadays in Tunisia, the Arabic Tunisian Dialect (TD) is increasingly used in interviews, news and debate programs instead of Modern Standard Arabic (MSA). This has given rise to a new kind of language: the majority of speech is no longer in MSA but alternates between MSA and TD. This situation has important negative consequences for Automatic Speech Recognition (ASR): si...

De l'arabe standard vers l'arabe dialectal : projection de corpus et ressources linguistiques en vue du traitement automatique de l'oral dans les médias tunisiens

In this work, we focus on the problems of automatically processing speech in the Tunisian media. This speech is marked by code-switching between Modern Standard Arabic (MSA) and the Tunisian dialect (TD). Our goal is to build useful resources for training language models that can be used in automatic speech recognition applications. As it is a variant of MSA, we describe in this...

GMM-Based Maghreb Dialect Identification System

While Modern Standard Arabic is the formal spoken and written language of the Arab world, dialects are the main mode of communication in everyday life. Therefore, identifying a speaker’s dialect is critical in the Arabic-speaking world for speech processing tasks, such as automatic speech recognition or identification. In this paper, we examine two approaches that reduce the Universal Background...

Automatic Detection of Transition Zones in Tunisian Dialect

This study extends our previous research on the detection of transition zones based on multiresolution spectral analysis (MRS). In this paper, we present the fourth step toward an automatic system for Tunisian Dialect segmentation and analysis. The MRS is calculated over several Fast Fourier Transforms (FFT) of different lengths. It can provide a higher temporal accura...

Building bilingual lexicon to create Dialect Tunisian corpora and adapt language model

Since the Tunisian revolution, Tunisian Dialect (TD), used in daily life, has become progressively used and represented in interviews, news and debate programs instead of Modern Standard Arabic (MSA). This situation has important negative consequences for natural language processing (NLP): since the spoken dialects are not officially written and do not have standard orthography, it is very costl...


Publication date: 2017